Contributor

@ivancea ivancea commented Oct 8, 2025

This PR covers everything up to, but not including, making the query settings accessible from the Configuration.

PR roadmap

  • Convert the query settings into something similar to the cluster setting objects, with an automatic parser that converts them to values, and so on (Example)
    • Reusing those classes doesn't look straightforward, as we use ESQL Expressions, not strings
    • Add a parser and a default value. Because the definitions are generic, I had to turn the enum into static fields
    • Organize the tests so it's easy to add more settings there
  • (Next PR) Add the values to the Configuration object, with a simple API like configuration.getSetting(QuerySettings.TIME_ZONE)
    • Refactored the configuration so it's created after the query parsing
    • (Next PR) For that, the QuerySettings class must be visible from wherever it's used, or the constants should be placed in some common location. Check this!
    • (Next PR) Add a TransportVersion to share them
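
To illustrate the "parser plus default value as static fields" idea from the roadmap, here is a minimal sketch. It is not the PR's code: `SketchSettingDef`, `SketchQuerySettings`, the `max_rows` setting, the `"time_zone"` key, and both default values are hypothetical, and a plain `Object` stands in for an ESQL Expression. Static fields are used because Java enum constants cannot each carry their own type parameter.

```java
import java.time.ZoneId;
import java.util.Map;
import java.util.function.Function;

/** Hypothetical typed setting definition; a stand-in, not the PR's actual class. */
final class SketchSettingDef<T> {
    final String name;
    final Function<Object, T> parser; // Object stands in for an ESQL Expression
    final T defaultValue;

    SketchSettingDef(String name, Function<Object, T> parser, T defaultValue) {
        this.name = name;
        this.parser = parser;
        this.defaultValue = defaultValue;
    }

    /** Parses the raw value if present, otherwise falls back to the default. */
    T resolve(Map<String, Object> rawSettings) {
        Object raw = rawSettings.get(name);
        return raw == null ? defaultValue : parser.apply(raw);
    }
}

/** Static fields instead of enum constants, so each definition keeps its own type parameter. */
final class SketchQuerySettings {
    // Assumed key and default; only the TIME_ZONE constant name comes from the discussion.
    static final SketchSettingDef<ZoneId> TIME_ZONE =
        new SketchSettingDef<>("time_zone", raw -> ZoneId.of(raw.toString()), ZoneId.of("UTC"));

    // Entirely hypothetical setting, included only to show a second value type.
    static final SketchSettingDef<Integer> MAX_ROWS =
        new SketchSettingDef<>("max_rows", raw -> Integer.valueOf(raw.toString()), 1000);

    private SketchQuerySettings() {}
}
```

With this shape, a caller would write something like `SketchQuerySettings.TIME_ZONE.resolve(rawSettings)` and get a `ZoneId` back without casting.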

Contributor

github-actions bot commented Oct 8, 2025

ℹ️ Important: Docs version tagging

👋 Thanks for updating the docs! Just a friendly reminder that our docs are now cumulative. This means all 9.x versions are documented on the same page and published off of the main branch, instead of creating separate pages for each minor version.

We use applies_to tags to mark version-specific features and changes.

When to use applies_to tags:

✅ At the page level to indicate which products/deployments the content applies to (mandatory)
✅ When features change state (e.g. preview, ga) in a specific version
✅ When availability differs across deployments and environments

What NOT to do:

❌ Don't remove or replace information that applies to an older version
❌ Don't add new information that applies to a specific version without an applies_to tag
❌ Don't forget that applies_to tags can be used at the page, section, and inline level

# Conflicts:
#	server/src/main/resources/transport/upper_bounds/9.3.csv
#	x-pack/plugin/esql/src/main/java/org/elasticsearch/xpack/esql/session/EsqlSession.java
Contributor

@alex-spies alex-spies left a comment

First round of comments

return new QuerySettingsMap(settings);
}

public static class QuerySettingsMap implements Writeable {
Contributor

nit: The QuerySettings class has become a class without instances. Why not re-use it for the map, rather than having a separate class that holds the map?

Contributor Author

Good idea, I'll try to do that (in the next PR)


private final Map<String, Expression> settings;

public QuerySettingsMap(Map<String, Expression> settings) {
Contributor

nit: Any reason for this not being a Map<QuerySettingsDef, Expression>? We could back this with an enum map to avoid looking things up by string.

Contributor Author

Currently the Def handles validation, parsing, and so on. It's similar to Settings, but for ESQL.
I created this class just to be transported to data nodes: plain data, to be read by the data nodes using their own Defs.

As I'm removing transport from this PR, I'll rethink those topics a bit

Contributor

This is fine and the API you made is cool.

But looking things up by string may be a little heavy; the QuerySettingDefs are essentially an enum and could be assigned IDs as well.

Contributor Author

Yes, we could later store it in the configuration as the map you describe, even without IDs, since the Defs are static objects. It would matter at transport time, but we're free to serialize them as strings there if we want.

So yep, I'll make this change in the next PR!
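
As a rough sketch of the map discussed in this thread, the values could be keyed by the definition objects themselves instead of by name. Everything here is hypothetical (`KeyedDef`, `KeyedSettings`, the `type` field); it only assumes, as stated above, that the definitions are static objects, so their default identity-based `equals`/`hashCode` is enough for a `HashMap` key.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical definition object; instances are static singletons, so identity works as the key. */
final class KeyedDef<T> {
    final String name;
    final Class<T> type;

    KeyedDef(String name, Class<T> type) {
        this.name = name;
        this.type = type;
    }
}

/** Holds final values keyed by definition rather than by setting name. */
final class KeyedSettings {
    private final Map<KeyedDef<?>, Object> values = new HashMap<>();

    /** Stores a value; the compiler checks that it matches the definition's type. */
    <T> KeyedSettings with(KeyedDef<T> def, T value) {
        values.put(def, value);
        return this;
    }

    /** Returns the stored value, or null when the setting was not provided. */
    <T> T get(KeyedDef<T> def) {
        return def.type.cast(values.get(def));
    }
}
```

An `EnumMap` would need the definitions to be real enum constants; with generic static fields, a plain `HashMap` keyed by identity is the closest equivalent, and the name is still available for serialization.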

}
}

public <T> T get(QuerySettingDef<T> def, RemoteClusterService clusterService) {
Contributor

Having to pass in the RCS every time sounds wrong. Isn't this something that we'd only do once, when the query is first parsed and we read the QuerySetting values from the parsed query?

That is, I'd expect that we use the RCS once while building the QuerySettingsMap, and after, the settings are fixed and passed around.

Otherwise, we'd have to pass the RCS to all functions that might want to get a setting value.

Contributor Author

I think the same. In theory, it depends on how we store this in the config: either the original Expression or the final value. This is still a bit abstract, since no actual config uses it yet, so I'm not sure it's required here to begin with. I could remove it from get() and keep it only in validate().
@luigidellaquila, any idea what we need that for?

Contributor

> or the final value

I think we should only store the final (Literal or <T>) value unless there's a good reason for settings to be complex enough that they have a dynamic value that can change where the setting is used. But even then, I think it still should be a final value and whatever is dynamic about it should be retrieved on the node where it's relevant.

I think the flow should be

  1. raw parsing from the query string
  2. validate + parse the actual settings, use RCS or pass in whatever else the settings need to be validated or parsed
  3. store the parsed result in a map that is passed on as part of the config.
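
The three steps above can be sketched roughly as follows. This is an illustration under assumptions, not the PR's implementation: `SettingsFlowSketch`, the `max_rows` setting, and the string-valued raw map are hypothetical, and step 1 (raw parsing from the query string) is assumed to have already produced `rawSettings`. Anything validation needs, such as the RCS, would be passed to `resolve` once rather than to every later lookup.

```java
import java.time.ZoneId;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch of: raw settings -> validate + parse -> immutable map of final values. */
final class SettingsFlowSketch {

    /** Step 2 for a single setting: rejects unknown names and parses the raw text. */
    static Object parseOne(String name, String raw) {
        switch (name) {
            case "time_zone":
                return ZoneId.of(raw); // throws DateTimeException for an invalid zone
            case "max_rows": // hypothetical setting, for illustration only
                return Integer.valueOf(raw);
            default:
                throw new IllegalArgumentException("unknown setting [" + name + "]");
        }
    }

    /** Steps 2 and 3: parse every raw setting once, then freeze the result. */
    static Map<String, Object> resolve(Map<String, String> rawSettings) {
        Map<String, Object> parsed = new HashMap<>();
        for (Map.Entry<String, String> entry : rawSettings.entrySet()) {
            parsed.put(entry.getKey(), parseOne(entry.getKey(), entry.getValue()));
        }
        // The immutable copy is what would travel with the configuration.
        return Map.copyOf(parsed);
    }

    private SettingsFlowSketch() {}
}
```

After `resolve` returns, consumers only read final values, so none of them needs access to whatever was used for validation.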

"esql_configuration_query_settings"
);

private final Map<String, Expression> settings;
Contributor

nit: I'd be a little more relaxed if this were restricted to Literals rather than arbitrary expressions ^^" Having an expression as a setting sounds more complex than we currently need.

@ivancea ivancea marked this pull request as ready for review October 10, 2025 11:08
@elasticsearchmachine
Collaborator

Pinging @elastic/es-analytical-engine (Team:Analytics)


package org.elasticsearch.xpack.esql.plugin;

public record EsqlQueryClusterSettings(
Contributor Author

This is just a bundle of the required cluster settings used by the EsqlSession to create the configuration. We could also pass the cluster settings themselves and let the EsqlSession do it itself, but I wasn't sure if we wanted that. I'll evaluate the change

Contributor Author

The major changes here are:

  • We no longer receive the config or the optimizers in the constructor
  • Instead, we create them in execute()
  • So we have to propagate them through the methods

Contributor Author

The "Configuration" object wasn't being used here at all, so I removed it. This led to many changes across the tests.
That's good, by the way, since otherwise creating the config after parsing would have been much harder

Contributor

@alex-spies alex-spies left a comment

Hey, I think the refactoring is super nice, thank you!

sessionId,
configuration,
foldCtx,
new EsqlQueryClusterSettings(
Contributor

Maybe this is more appropriate as a name:

Suggested change
- new EsqlQueryClusterSettings(
+ new AnalyzerSettings(

Or QueryTruncationSettings or so.

We also have PlannerSettings that are obtained from the ClusterService.


Labels

  • :Analytics/ES|QL (AKA ESQL)
  • >non-issue
  • Team:Analytics (Meta label for analytical engine team (ESQL/Aggs/Geo))
  • v9.3.0

3 participants